Fix flaky parallel tools test by removing non-deterministic synchronization#161
Merged
Fix flaky parallel tools test by removing non-deterministic synchronization#161
Conversation
1 task
…zation The testShouldExecuteMultipleCustomToolsInParallelSingleTurn test used CountDownLatch barriers to verify that tool handlers overlapped in execution. This caused a race condition: both handlers completed simultaneously after the barrier was released, and the order in which tool results were sent back to the CLI was non-deterministic. When results arrived in a different order than the snapshot expected (toolcall_1 before toolcall_0), the proxy returned a 500 error. The fix simplifies the test to match the reference implementation approach: tools return immediately, and we verify both tools were called and the response contains both results. The SDK still dispatches tools concurrently via its executor; the test just no longer forces a specific timing that causes ordering issues. Fixes #158 Co-authored-by: edburns <75821+edburns@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Investigate and resolve failed test run after merging #157
Fix flaky parallel tools test by removing non-deterministic synchronization
May 5, 2026
edburns
approved these changes
May 5, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates the E2E tools test suite to remove synchronization that made the parallel-tools snapshot replay flaky. It aims to keep the Java SDK aligned with the reference implementation while making the test deterministic against the strict replaying proxy.
Changes:
- Simplifies
testShouldExecuteMultipleCustomToolsInParallelSingleTurnso both tool handlers return immediately. - Removes latch/atomic coordination code used to force handler overlap.
- Keeps the test focused on successful dual-tool invocation and combined assistant output.
Show a summary per file
| File | Description |
|---|---|
src/test/java/com/github/copilot/sdk/ToolsTest.java |
Simplifies the parallel custom tools E2E test by removing forced synchronization and overlap assertions. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 1
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resolves #158
Before the change?
testShouldExecuteMultipleCustomToolsInParallelSingleTurnusedCountDownLatchbarriers to force both tool handlers to overlap, then released them simultaneously. The completion order was non-deterministic — sometimestoolcall_1result was sent to the CLI beforetoolcall_0. The replaying proxy performs strict message-order matching, so when tool results arrived out of snapshot order, it returned 500.After the change?
Pull request checklist
mvn spotless:applyhas been run to format the codemvn clean verifypasses locallyDoes this introduce a breaking change?